MPI(1)                                                                MPI(1)

NAME
     MPI - Introduction to the Message Passing Interface (MPI)

DESCRIPTION
     The Message Passing Interface (MPI) is a component of the Message Passing Toolkit (MPT), which is a software package that supports parallel programming across a network of computer systems through a technique known as message passing. The goal of MPI, simply stated, is to develop a widely used standard for writing message-passing programs. As such, the interface establishes a practical, portable, efficient, and flexible standard for message passing.

     This MPI implementation supports the MPI 1.2 standard, as documented by the MPI Forum in the spring 1997 release of MPI: A Message Passing Interface Standard. In addition, certain MPI-2 features are also supported.

     In designing MPI, the MPI Forum sought to make use of the most attractive features of a number of existing message passing systems, rather than selecting one of them and adopting it as the standard. Thus, MPI has been strongly influenced by work at the IBM T. J. Watson Research Center, Intel's NX/2, Express, nCUBE's Vertex, p4, and PARMACS. Other important contributions have come from Zipcode, Chimp, PVM, Chameleon, and PICL.

     For IRIX systems, MPI requires the presence of an Array Services daemon (arrayd) on each host that is to run MPI processes. In a single-host environment, no system administration effort should be required beyond installing and activating arrayd. However, users wishing to run MPI applications across multiple hosts will need to ensure that those hosts are properly configured into an array. For more information about Array Services, see the arrayd(1M), arrayd.conf(4), and array_services(5) man pages.

     When running across multiple hosts, users must set up their .rhosts files to enable remote logins. Note that MPI does not use rsh, so it is not necessary that rshd be running on security-sensitive systems; the .rhosts file was simply chosen to eliminate the need to learn yet another mechanism for enabling remote logins.

     Other sources of MPI information are as follows:

     *  Man pages for MPI library functions

     *  A copy of the MPI standard as PostScript or hypertext on the World Wide Web at the following URL:

           http://www.mpi-forum.org/

     *  Other MPI resources on the World Wide Web, such as the following:

           http://www.mcs.anl.gov/mpi/index.html
           http://www.erc.msstate.edu/mpi/index.html
           http://www.mpi.nd.edu/lam/

  Getting Started
     For IRIX systems, the Modules software package is available to support one or more installations of MPT. To use the MPT software, load the desired mpt module.
     After you have initialized modules, enter the following command:

          module load mpt

     To unload the mpt module, enter the following command:

          module unload mpt

     MPT software can be installed in an alternate location for use with the modules software package. If MPT software has been installed on your system for use with modules, you can access the software with the module command shown in the previous example. If MPT has not been installed for use with modules, the software resides in default locations on your system (/usr/include, /usr/lib, /usr/array/PVM, and so on), as in previous releases. For further information, see Installing MPT for Use with Modules, in the Modules relnotes.

  Using MPI
     Compile and link your MPI program as shown in the following examples.

     IRIX systems:

     To use the 64-bit MPI library, choose one of the following commands:

          cc -64 compute.c -lmpi
          f77 -64 -LANG:recursive=on compute.f -lmpi
          f90 -64 -LANG:recursive=on compute.f -lmpi
          CC -64 compute.C -lmpi++ -lmpi

     To use the 32-bit MPI library, choose one of the following commands:

          cc -n32 compute.c -lmpi
          f77 -n32 -LANG:recursive=on compute.f -lmpi
          f90 -n32 -LANG:recursive=on compute.f -lmpi
          CC -n32 compute.C -lmpi++ -lmpi

     Linux systems:

     To use the 64-bit MPI library on Linux IA64 systems, choose one of the following commands:

          g++ -o myprog myproc.C -lmpi++ -lmpi
          gcc -o myprog myprog.c -lmpi
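     For reference, a minimal MPI program of the kind compiled in the preceding examples might look like the following sketch (a hypothetical compute.c; the file name and output text are illustrative only):

          #include <stdio.h>
          #include <mpi.h>

          int main(int argc, char *argv[])
          {
              int rank, size;

              MPI_Init(&argc, &argv);                /* start up MPI */
              MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process */
              MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

              printf("Hello from rank %d of %d\n", rank, size);

              MPI_Finalize();                        /* see the note on MPI_Finalize below */
              return 0;
          }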
     For IRIX systems, if Fortran 90 compiler 7.2.1 or higher is installed, you can add the -auto_use option as follows to get compile-time checking of MPI subroutine calls:

          f90 -auto_use mpi_interface -64 compute.f -lmpi
          f90 -auto_use mpi_interface -n32 compute.f -lmpi

     For IRIX with MPT version 1.4 or higher, the Fortran 90 USE MPI feature is supported. You can replace the include 'mpif.h' statement in your Fortran 90 source code with USE MPI. This facility includes MPI type and parameter definitions, and performs compile-time checking of MPI function and subroutine calls.

     NOTE: Do not use the Fortran 90 -auto_use mpi_interface option to compile IRIX Fortran 90 source code that contains the USE MPI statement. They are incompatible with each other.

     For IRIX systems, applications compiled under a previous release of MPI should not require recompilation to run under this new (3.3) release. However, it is not possible for executable files running under the 3.2 release to interoperate with others running under the 3.3 release.

     The C version of the MPI_Init(3) routine ignores the arguments that are passed to it and does not modify them.

     Stdin is enabled only for those MPI processes with rank 0 in the first MPI_COMM_WORLD (which does not need to be located on the same host as mpirun). Stdout and stderr results are enabled for all MPI processes in the job, whether launched via mpirun or one of the MPI-2 spawn functions.

     This version of the IRIX MPI implementation is compatible with the sproc system call and can therefore coexist with doacross loops. However, on Linux systems, the MPI library is not thread safe. Therefore, calls to MPI routines in a multithreaded application will require some form of mutual exclusion. The MPI_Init_thread call can be used to request thread safety.

     For IRIX and Linux systems, this implementation of MPI requires that all MPI processes call MPI_Finalize eventually.
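     As an illustration of the thread-safety note above, the following sketch requests a thread support level with MPI_Init_thread. The level requested here and the fallback behavior are assumptions for the example; the level actually granted is returned in the provided argument and should be checked by the application.

          #include <stdio.h>
          #include <mpi.h>

          int main(int argc, char *argv[])
          {
              int provided;

              /* Request serialized thread support; MPI reports what it grants. */
              MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);

              if (provided < MPI_THREAD_SERIALIZED) {
                  /* The application must supply its own mutual exclusion
                     around MPI calls made from multiple threads. */
                  fprintf(stderr, "MPI granted thread level %d\n", provided);
              }

              /* ... computation and communication ... */

              MPI_Finalize();   /* every MPI process must eventually call this */
              return 0;
          }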
  Buffering
     For IRIX systems, the current implementation buffers messages unless the message from the sending process resides in the symmetric data, symmetric heap, or global heap segment. Buffered messages are grouped into two classes based on length: short (messages with lengths of 64 bytes or less) and long (messages with lengths greater than 64 bytes). For more information on using the symmetric data, symmetric heap, or global heap, see the MPI_BUFFER_MAX environment variable.

  Myrinet (GM) Support
     This release provides support for use of the GM protocol over Myrinet interconnects on IRIX systems. Support is currently limited to 64-bit applications.

  Using MPI with cpusets
     You can use cpusets to run MPI applications (see cpuset(4)). However, it is highly recommended that the cpuset have the MEMORY_LOCAL attribute. On Origin systems, if this attribute is not used, you should disable NUMA optimizations (see the MPI_DSM_OFF environment variable description in the following section).

  Default Interconnect Selection
     Beginning with the MPT 1.6 release, the search algorithm for selecting a multi-host interconnect has been significantly modified. By default, if MPI is being run across multiple hosts, or if multiple binaries are specified on the mpirun command, the software now searches for interconnects in the following order (for IRIX systems):

          1) XPMEM (NUMAlink - only available on partitioned systems)
          2) GSN
          3) MYRINET
          4) HIPPI 800
          5) TCP/IP

     The only supported interconnect on Linux systems is TCP/IP.

     MPI uses the first interconnect it can detect and configure correctly. There will only be one interconnect configured for the entire MPI job, with the exception of XPMEM. If XPMEM is found on some hosts, but not on others, one additional interconnect is selected.

     The user can specify a mandatory interconnect to use by setting one of the following new environment variables. These variables will be assessed in the following order:

          1) MPI_USE_XPMEM
          2) MPI_USE_GSN
          3) MPI_USE_GM
          4) MPI_USE_HIPPI
          5) MPI_USE_TCP

     For a mandatory interconnect to be used, all of the hosts on the mpirun command line must be connected via the device, and the interconnect must be configured properly. If this is not the case, an error message is printed to stdout and the job is terminated. XPMEM is an exception to this rule, however. If MPI_USE_XPMEM is set, one additional interconnect can be selected via the MPI_USE variables. Messaging between the partitioned hosts will use the XPMEM driver, while messaging between non-partitioned hosts will use the second interconnect. If a second interconnect is required but not selected by the user, MPI will choose the interconnect to use, based on the default hierarchy.

     If the global -v verbose option is used on the mpirun command line, a message is printed to stdout, indicating which multi-host interconnect is being used for the job.

     The following interconnect selection environment variables have been deprecated in the MPT 1.6 release: MPI_GSN_ON, MPI_GM_ON, and MPI_BYPASS_OFF. If any of these variables are set, MPI prints a warning message to stdout; their settings are otherwise ignored.

  Using MPI-2 Process Creation and Management Routines
     This release provides support for MPI_Comm_spawn and MPI_Comm_spawn_multiple on IRIX systems. However, options must be included on the mpirun command line to enable this feature. When these options are present on the mpirun command line, the MPI job is running in spawn capable mode. In this release, these spawn features are only available for jobs confined to a single IRIX host.
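     As an illustration of the spawn feature, a parent program might call MPI_Comm_spawn as in the following sketch. The worker binary name and the process count are hypothetical, and the mpirun options needed to enable spawn capable mode are not shown here.

          #include <mpi.h>

          int main(int argc, char *argv[])
          {
              MPI_Comm intercomm;
              int errcodes[4];

              MPI_Init(&argc, &argv);

              /* Launch 4 copies of a hypothetical "worker" binary; an
                 intercommunicator connecting the parent job to the spawned
                 processes is returned in intercomm. */
              MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                             0, MPI_COMM_WORLD, &intercomm, errcodes);

              /* ... communicate with the spawned processes over intercomm ... */

              MPI_Finalize();
              return 0;
          }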
ENVIRONMENT VARIABLES
     This section describes the variables that specify the environment under which your MPI programs will run. Unless otherwise specified, these variables are available for both Linux and IRIX systems. Environment variables have predefined values. You can change some variables to achieve particular performance objectives; others are required values for standard-compliant programs.

     MPI_ARRAY (IRIX systems only)
          Sets an alternative array name to be used for communicating with Array Services when a job is being launched.

          Default: The default name set in the arrayd.conf file

     MPI_BAR_COUNTER (IRIX systems only)
          Specifies the use of a simple counter barrier algorithm within the MPI_Barrier(3) and MPI_Win_fence(3) functions.

          Default: Not enabled if job contains more than 64 PEs.

     MPI_BAR_DISSEM (IRIX systems only)
          Specifies the use of the alternate barrier algorithm, the dissemination/butterfly, within the MPI_Barrier(3) and MPI_Win_fence(3) functions. This alternate algorithm provides better performance on jobs with larger PE counts. The MPI_BAR_DISSEM option is recommended for jobs with PE counts of 64 or higher.

          Default: Disabled if job contains less than 64 PEs; otherwise, enabled.

     MPI_BUFFER_MAX (IRIX systems only)
          Specifies a minimum message size, in bytes, for which the message will be considered a candidate for single-copy transfer. Currently, this mechanism is available only for communication between MPI processes on the same host. The sender data must reside in either the symmetric data, symmetric heap, or global heap. The MPI data type on the send side must also be a contiguous type.

          If the XPMEM driver is enabled (for single host jobs, see MPI_XPMEM_ON and for multihost jobs, see MPI_USE_XPMEM), MPI allows single-copy transfers for basic predefined MPI data types from any sender data location, including the stack and private heap. The XPMEM driver also allows single-copy transfers across partitions.

          If cross mapping of data segments is enabled at job startup, data in common blocks will reside in the symmetric data segment. On systems running IRIX 6.5.2 or higher, this feature is enabled by default. You can employ the symmetric heap by using the shmalloc (shpalloc) functions available in LIBSMA.

          Testing of this feature has indicated that most MPI applications benefit more from buffering of medium-sized messages than from buffering of large size messages, even though buffering of medium-sized messages requires an extra copy of data. However, highly synchronized applications that perform large message transfers can benefit from the single-copy pathway.

          Default: Not enabled

     MPI_BUFS_PER_HOST
          Determines the number of shared message buffers (16 KB each) that MPI is to allocate for each host. These buffers are used to send long messages and interhost messages.

          Default: 32 pages (1 page = 16KB)

     MPI_BUFS_PER_PROC
          Determines the number of private message buffers (16 KB each) that MPI is to allocate for each process. These buffers are used to send long messages and intrahost messages.
          Default: 32 pages (1 page = 16KB)

     MPI_BYPASS_CRC (IRIX systems only)
          Adds a checksum to each long message sent via HiPPI bypass. If the checksum does not match the data received, the job is terminated. Use of this environment variable might degrade performance.

          Default: Not set

     MPI_BYPASS_DEV_SELECTION (IRIX systems only)
          Specifies the algorithm MPI is to use for sending messages over multiple HIPPI adapters. Set this variable to one of the following values:

          Value     Action

          0         Static device selection. In this case, a process is assigned a HIPPI device to use for communication with processes on another host. The process uses only this HIPPI device to communicate with another host. This algorithm has been observed to be effective when interhost communication patterns are dominated by large messages (significantly more than 16K bytes).

          1         Dynamic device selection. In this case, a process can select from any of the devices available for communication between any given pair of hosts. The first device that is not being used by another process is selected. This algorithm has been found to work best for applications in which multiple processes are trying to send medium-sized messages (16K or fewer bytes) between processes on different hosts. Large messages (more than 16K bytes) are split into chunks of 16K bytes. Different chunks can be sent over different HIPPI devices.

          2         Round robin device selection. In this case, each process sends successive messages over a different HIPPI 800 device.

          Default: 1

     MPI_BYPASS_DEVS (IRIX systems only)
          Sets the order for opening HiPPI adapters. The list of devices does not need to be space-delimited (0123 is valid). A maximum of 16 adapters are supported on a single host. To reference adapters 10 through 15, use the letters a through f or A through F, respectively.

          An array node usually has at least one HiPPI adapter, the interface to the HiPPI network. The HiPPI bypass is a lower software layer that interfaces directly to this adapter.

          When you know that a system has multiple HiPPI adapters, you can use the MPI_BYPASS_DEVS variable to specify the adapter that a program opens first. This variable can be used to ensure that multiple MPI programs distribute their traffic across the available adapters. If you prefer not to use the HiPPI bypass, you can turn it off by setting the MPI_BYPASS_OFF variable. When a HiPPI adapter reaches its maximum capacity of four MPI programs, it is not available to additional MPI programs. If all HiPPI adapters are busy, MPI sends internode messages by using TCP over the adapter instead of the bypass.

          Default: MPI will use all available HiPPI devices

     MPI_BYPASS_SINGLE (IRIX systems only)
          The HiPPI OS bypass multiboard feature, which allows MPI messages to be sent over multiple HiPPI connections if multiple connections are available, is enabled by default. Setting this environment variable disables it. When you set this variable, MPI operates as it did in previous releases, with use of a single HiPPI adapter connection, if available.
          Default: Not enabled

     MPI_BYPASS_VERBOSE (IRIX systems only)
          Allows additional MPI initialization information to be printed in the standard output stream. This information contains details about the HiPPI OS bypass connections and the HiPPI adapters that are detected on each of the hosts.

          Default: Not enabled

     MPI_CHECK_ARGS
          Enables checking of MPI function arguments. Segmentation faults might occur if bad arguments are passed to MPI, so this is useful for debugging purposes. Using argument checking adds several microseconds to latency.

          Default: Not enabled

     MPI_COMM_MAX
          Sets the maximum number of communicators that can be used in an MPI program. Use this variable to increase internal default limits. (Might be required by standard-compliant programs.) MPI generates an error message if this limit (or the default, if not set) is exceeded.

          Default: 256

     MPI_DIR
          Sets the working directory on a host. When an mpirun(1) command is issued, the Array Services daemon on the local or distributed node responds by creating a user session and starting the required MPI processes. The user ID for the session is that of the user who invokes mpirun, so this user must be listed in the .rhosts file on the corresponding nodes. By default, the working directory for the session is the user's $HOME directory on each node. You can direct all nodes to a different directory (an NFS directory that is available to all nodes, for example) by setting the MPI_DIR variable to a different directory.

          Default: $HOME on the node. If using the -np or -nt option of mpirun(1), the default is the current directory.

     MPI_DPLACE_INTEROP_OFF (IRIX systems only)
          Disables an MPI/dplace interoperability feature available beginning with IRIX 6.5.13. By setting this variable, you can obtain the behavior of MPI with dplace on older releases of IRIX.

          Default: Not enabled

     MPI_DSM_CPULIST (IRIX systems only)
          Specifies a list of CPUs on which to run an MPI application. To ensure that processes are linked to CPUs, this variable should be used in conjunction with the MPI_DSM_MUSTRUN variable. For an explanation of the syntax for this environment variable, see the section titled "Using a CPU List."

     MPI_DSM_MUSTRUN (IRIX systems only)
          Enforces memory locality for MPI processes. Use of this feature ensures that each MPI process will get a CPU and physical memory on the node to which it was originally assigned. This variable has been observed to improve program performance on IRIX systems running release 6.5.7 and earlier, when running a program on a quiet system. With later IRIX releases, under certain circumstances, setting this variable is not necessary. Internally, this feature directs the library to use the process_cpulink(3) function instead of process_mldlink(3) to control memory placement.
          MPI_DSM_MUSTRUN should not be used when the job is submitted to miser (see miser_submit(1)) because program hangs may result.

          The process_cpulink(3) function is inherited across process fork(2) or sproc(2). For this reason, when using mixed MPI/OpenMP applications, it is recommended either that this variable not be set, or that _DSM_MUSTRUN also be set (see pe_environ(5)).

          Default: Not enabled

     MPI_DSM_OFF (IRIX systems only)
          Turns off nonuniform memory access (NUMA) optimization in the MPI library.

          Default: Not enabled

     MPI_DSM_PLACEMENT (IRIX systems only)
          Specifies the default placement policy to be used for the stack and data segments of an MPI process. Set this variable to one of the following values:

          Value               Action

          firsttouch          With this policy, IRIX attempts to satisfy requests for new memory pages for stack, data, and heap memory on the node where the requesting process is currently scheduled.

          fixed               With this policy, IRIX attempts to satisfy requests for new memory pages for stack, data, and heap memory on the node associated with the memory locality domain (mld) with which an MPI process was linked at job startup. This is the default policy for MPI processes.

          roundrobin          With this policy, IRIX attempts to satisfy requests for new memory pages in a round robin fashion across all of the nodes associated with the MPI job. It is generally not recommended to use this setting.

          threadroundrobin    This policy is intended for use with hybrid MPI/OpenMP applications only. With this policy, IRIX attempts to satisfy requests for new memory pages for the MPI process stack, data, and heap memory in a roundrobin fashion across the nodes allocated to its OpenMP threads. This placement option might be helpful for large OpenMP/MPI process ratios. For non-OpenMP applications, this value is ignored.

          Default: fixed

     MPI_DSM_PPM (IRIX systems only)
          Sets the number of MPI processes per memory locality domain (mld). For Origin 2000 systems, values of 1 or 2 are allowed. For Origin 3000 and Origin 300 systems, values of 1, 2, or 4 are allowed.

          Default: Origin 2000 systems, 2; Origin 3000 and Origin 300 systems, 4.

     MPI_DSM_TOPOLOGY (IRIX systems only)
          Specifies the shape of the set of hardware nodes on which the PE memories are allocated. Set this variable to one of the following values:

          Value          Action

          cube           A group of memory nodes that form a perfect hypercube. The number of processes per host must be a power of 2. If a perfect hypercube is unavailable, a less restrictive placement will be used.

          cube_fixed     A group of memory nodes that form a perfect hypercube. The number of processes per host must be a power of 2. If a perfect hypercube is unavailable, the placement will fail, disabling NUMA placement.
          cpucluster     Any group of memory nodes. The operating system attempts to place the group numbers close to one another, taking into account nodes with disabled processors. (Default for IRIX 6.5.11 and higher.)

          free           Any group of memory nodes. The operating system attempts to place the group numbers close to one another. (Default for IRIX 6.5.10 and earlier releases.)

     MPI_DSM_VERBOSE (IRIX systems only)
          Instructs mpirun(1) to print information about process placement for jobs running on nonuniform memory access (NUMA) machines (unless MPI_DSM_OFF is also set). Output is sent to stderr.

          Default: Not enabled

     MPI_DSM_VERIFY (IRIX systems only)
          Instructs mpirun(1) to run some diagnostic checks on proper memory placement of MPI data structures at job startup. If errors are found, a diagnostic message is printed to stderr.

          Default: Not enabled

     MPI_GM_DEVS (IRIX systems only)
          Sets the order for opening GM (Myrinet) adapters. The list of devices does not need to be space-delimited (0321 is valid). The syntax is the same as for the MPI_BYPASS_DEVS environment variable. In this release, a maximum of 8 adapters are supported on a single host.

          Default: MPI will use all available GM (Myrinet) devices.

     MPI_GM_VERBOSE
          Setting this variable allows some diagnostic information concerning messaging between processes using GM (Myrinet) to be displayed on stderr.

          Default: Not enabled

     MPI_GROUP_MAX
          Determines the maximum number of groups that can simultaneously exist for any single MPI process. Use this variable to increase internal default limits. (This variable might be required by standard-compliant programs.) MPI generates an error message if this limit (or the default, if not set) is exceeded.

          Default: 32

     MPI_GSN_DEVS (IRIX 6.5.12 systems or later)
          Sets the order for opening GSN adapters. The list of devices does not need to be quoted or space-delimited (0123 is valid).

          Default: MPI will use all available GSN devices

     MPI_GSN_VERBOSE (IRIX 6.5.12 systems or later)
          Allows additional MPI initialization information to be printed in the standard output stream. This information contains details about the GSN (ST protocol) OS bypass connections and the GSN adapters that are detected on each of the hosts.

          Default: Not enabled

     MPI_MSG_RETRIES
          Specifies the number of times the MPI library will try to get a message header, if none are available. Each MPI message that is sent requires an initial message header. If one is not available after MPI_MSG_RETRIES, the job will abort. Note that this variable no longer applies to processes on the same host, or when using the GM (Myrinet) protocol. In these cases, message headers are allocated dynamically on an as-needed basis.

          Default: 500

     MPI_MSGS_MAX
          This variable can be set to control the total number of message headers that can be allocated.
          This allocation applies to messages exchanged between processes on a single host, or between processes on different hosts when using the GM (Myrinet) OS bypass protocol. Note that the initial allocation of memory for message headers is 128 Kbytes.

          Default: Allow up to 64 Mbytes to be allocated for message headers. If you set this variable, specify the maximum number of message headers.

     MPI_MSGS_PER_HOST
          Sets the number of message headers to allocate for MPI messages on each MPI host. Space for messages that are destined for a process on a different host is allocated as shared memory on the host on which the sending processes are located. MPI locks these pages in memory. Use the MPI_MSGS_PER_HOST variable to allocate buffer space for interhost messages.

          Caution: If you set the memory pool for interhost packets to a large value, you can cause allocation of so much locked memory that total system performance is degraded.

          The previous description does not apply to processes that use the GM (Myrinet) OS bypass protocol. In this case, message headers are allocated dynamically as needed. See the MPI_MSGS_MAX variable description.

          Default: 1024 messages

     MPI_MSGS_PER_PROC
          This variable is effectively obsolete. Message headers are now allocated on an as-needed basis for messaging either between processes on the same host, or between processes on different hosts when using the GM (Myrinet) OS bypass protocol. The new MPI_MSGS_MAX variable can be used to control the total number of message headers that can be allocated.

          Default: 1024

     MPI_OPENMP_INTEROP (IRIX systems only)
          Setting this variable modifies the placement of MPI processes to better accommodate the OpenMP threads associated with each process. For more information, see the section titled "Using MPI with OpenMP."

          NOTE: This option is available only on Origin 300 and Origin 3000 servers.

          Default: Not enabled

     MPI_REQUEST_MAX
          Determines the maximum number of nonblocking sends and receives that can simultaneously exist for any single MPI process. Use this variable to increase internal default limits. (This variable might be required by standard-compliant programs.) MPI generates an error message if this limit (or the default, if not set) is exceeded.

          Default: 16384

     MPI_SHARED_VERBOSE
          Setting this variable allows for some diagnostic information concerning messaging within a host to be displayed on stderr.

          Default: Not enabled

     MPI_SLAVE_DEBUG_ATTACH
          Specifies the MPI process to be debugged. If you set MPI_SLAVE_DEBUG_ATTACH to N, the MPI process with rank N prints a message during program startup, describing how to attach to it from another window using the dbx debugger on IRIX or the gdb debugger on Linux. You must attach the debugger to process N within ten seconds of the printing of the message.
     MPI_STATIC_NO_MAP (IRIX systems only)
          Disables cross mapping of static memory between MPI processes. This variable can be set to reduce the significant MPI job startup and shutdown time that can be observed for jobs involving more than 512 processors on a single IRIX host. Note that setting this shell variable disables certain internal MPI optimizations and also restricts the usage of MPI-2 one-sided functions. For more information, see the MPI_Win man page.

          Default: Not enabled

     MPI_STATS
          Enables printing of MPI internal statistics. Each MPI process prints statistics about the amount of data sent with MPI calls during the MPI_Finalize process. Data is sent to stderr. To prefix the statistics messages with the MPI rank, use the -p option on the mpirun command. For additional information, see the MPI_SGI_stats man page.

          NOTE: Because the statistics-collection code is not thread-safe, this variable should not be set if the program uses threads.

          Default: Not enabled

     MPI_TYPE_DEPTH
          Sets the maximum number of nesting levels for derived data types. (Might be required by standard-compliant programs.) The MPI_TYPE_DEPTH variable limits the maximum depth of derived data types that an application can create. MPI generates an error message if this limit (or the default, if not set) is exceeded.

          Default: 8 levels

     MPI_TYPE_MAX
          Determines the maximum number of data types that can simultaneously exist for any single MPI process. Use this variable to increase internal default limits. (This variable might be required by standard-compliant programs.) MPI generates an error message if this limit (or the default, if not set) is exceeded.

          Default: 1024

     MPI_UNBUFFERED_STDIO
          Normally, mpirun line-buffers output received from the MPI processes on both the stdout and stderr standard IO streams. This prevents lines of text from different processes from possibly being merged into one line, and allows use of the mpirun -prefix option. Of course, there is a limit to the amount of buffer space that mpirun has available (currently, about 8,100 characters can appear between new line characters per stream per process). If more characters are emitted before a new line character, the MPI program will abort with an error message.

          Setting the MPI_UNBUFFERED_STDIO environment variable disables this buffering. This is useful, for example, when a program's rank 0 emits a series of periods over time to indicate progress of the program. With buffering, the entire line of periods will be output only when the new line character is seen. Without buffering, each period will be immediately displayed as soon as mpirun receives it from the MPI program. (Note that the MPI program still needs to call fflush(3) or FLUSH(101) to flush the stdout buffer from the application code.)
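          For illustration, a rank 0 progress indicator of the kind described above might be written as in the following sketch (the work loop and the output are placeholders for the example):

               #include <stdio.h>
               #include <mpi.h>

               int main(int argc, char *argv[])
               {
                   int rank, i;

                   MPI_Init(&argc, &argv);
                   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

                   for (i = 0; i < 100; i++) {
                       /* ... one unit of work ... */
                       if (rank == 0) {
                           printf(".");
                           fflush(stdout);   /* push each period to mpirun immediately */
                       }
                   }
                   if (rank == 0)
                       printf("\n");

                   MPI_Finalize();
                   return 0;
               }

          With default buffering, mpirun would display the whole line of periods only when the newline arrives; with MPI_UNBUFFERED_STDIO set, each period is displayed as it is received.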
          Additionally, setting MPI_UNBUFFERED_STDIO allows an MPI program that emits very long output lines to execute correctly.

          NOTE: If MPI_UNBUFFERED_STDIO is set, the mpirun -prefix option is ignored.

          Default: Not set

     MPI_USE_GM (IRIX systems only)
          Requires the MPI library to use the Myrinet (GM protocol) OS bypass driver as the interconnect when running across multiple hosts or running with multiple binaries. If a GM connection cannot be established among all hosts in the MPI job, the job is terminated. For more information, see the section titled "Default Interconnect Selection."

          Default: Not set

     MPI_USE_GSN (IRIX 6.5.12 systems or later)
          Requires the MPI library to use the GSN (ST protocol) OS bypass driver as the interconnect when running across multiple hosts or running with multiple binaries. If a GSN connection cannot be established among all hosts in the MPI job, the job is terminated.

          GSN imposes a limit of one MPI process using GSN per CPU on a system. For example, on a 128-CPU system, you can run multiple MPI jobs, as long as the total number of MPI processes using the GSN bypass does not exceed 128. Once the maximum allowed MPI processes using GSN is reached, subsequent MPI jobs return an error to the user output, as in the following example:

               MPI: Could not connect all processes to GSN adapters. The
               maximum number of GSN adapter connections per system is
               normally equal to the number of CPUs on the system.

          If there are a few CPUs still available, but not enough to satisfy the entire MPI job, the error will still be issued and the MPI job terminated.

          For more information, see the section titled "Default Interconnect Selection."

          Default: Not set

     MPI_USE_HIPPI (IRIX systems only)
          Requires the MPI library to use the HiPPI 800 OS bypass driver as the interconnect when running across multiple hosts or running with multiple binaries. If a HiPPI connection cannot be established among all hosts in the MPI job, the job is terminated. For more information, see the section titled "Default Interconnect Selection."

          Default: Not set

     MPI_USE_TCP
          Requires the MPI library to use the TCP/IP driver as the interconnect when running across multiple hosts or running with multiple binaries. For more information, see the section titled "Default Interconnect Selection."

          Default: Not set

     MPI_USE_XPMEM (IRIX 6.5.13 systems or later)
          Requires the MPI library to use the XPMEM driver as the interconnect when running across multiple hosts or running with multiple binaries. This driver allows MPI processes running on one partition to communicate with MPI processes on a different partition via the NUMAlink network. The NUMAlink network is powered by block transfer engines (BTEs).
          BTE data transfers do not require processor resources.

          The XPMEM (cross partition) device driver is available only on Origin 3000 and Origin 300 systems running IRIX 6.5.13 or greater.

          NOTE: Due to possible MPI program hangs, you should not run MPI across partitions using the XPMEM driver on IRIX versions 6.5.13, 6.5.14, or 6.5.15. This problem has been resolved in IRIX version 6.5.16.

          If all of the hosts specified on the mpirun command do not reside in the same partitioned system, you can select one additional interconnect via the MPI_USE variables. MPI communication between partitions will go through the XPMEM driver, and communication between non-partitioned hosts will go through the second interconnect. For more information, see the section titled "Default Interconnect Selection."

          Default: Not set

     MPI_XPMEM_ON
          Enables the XPMEM single-copy enhancements for processes residing on the same host. The XPMEM enhancements allow single-copy transfers for basic predefined MPI data types from any sender data location, including the stack and private heap. Without enabling XPMEM, single-copy is allowed only from data residing in the symmetric data, symmetric heap, or global heap.

          Both the MPI_XPMEM_ON and MPI_BUFFER_MAX variables must be set to enable these enhancements. Both are disabled by default.

          If the following additional conditions are met, the block transfer engine (BTE) is invoked instead of bcopy, to provide increased bandwidth:

          *  Send and receive buffers are cache-aligned.

          *  Amount of data to transfer is greater than or equal to the MPI_XPMEM_THRESHOLD value.

          NOTE: The XPMEM driver does not support checkpoint/restart at this time. If you enable these XPMEM enhancements, you will not be able to checkpoint and restart your MPI job.

          The XPMEM single-copy enhancements require an Origin 3000 or Origin 300 server running IRIX release 6.5.15 or greater.

          Default: Not set

     MPI_XPMEM_THRESHOLD
          Specifies a minimum message size, in bytes, for which single-copy messages between processes residing on the same host will be transferred via the BTE, instead of bcopy. The following conditions must exist before the BTE transfer is invoked:

          *  Single-copy mode is enabled (MPI_BUFFER_MAX).

          *  XPMEM single-copy enhancements are enabled (MPI_XPMEM_ON).

          *  Send and receive buffers are cache-aligned.

          *  Amount of data to transfer is greater than or equal to the MPI_XPMEM_THRESHOLD value.

          Default: 8192

     MPI_XPMEM_VERBOSE
          Setting this variable allows additional MPI diagnostic information to be printed in the standard output stream. This information contains details about the XPMEM connections.

          Default: Not enabled

     PAGESIZE_DATA (IRIX systems only)
          Specifies the desired page size in kilobytes for program data areas. On Origin series systems, supported values include 16, 64, 256, 1024, and 4096. Specified values must be integer.
          NOTE: Setting MPI_DSM_OFF disables the ability to set the data page size via this shell variable.

          Default: Not enabled

     PAGESIZE_STACK (IRIX systems only)
          Specifies the desired page size in kilobytes for program stack areas. On Origin series systems, supported values include 16, 64, 256, 1024, and 4096. Specified values must be integer.

          NOTE: Setting MPI_DSM_OFF disables the ability to set the stack page size via this shell variable.

          Default: Not enabled

     SMA_GLOBAL_ALLOC (IRIX systems only)
          Activates the LIBSMA based global heap facility. This variable is used by 64-bit MPI applications for certain internal optimizations, as well as support for the MPI_Alloc_mem function. For additional details, see the intro_shmem(3) man page.

          Default: Not enabled

     SMA_GLOBAL_HEAP_SIZE (IRIX systems only)
          For 64-bit applications, specifies the per-process size of the LIBSMA global heap in bytes.

          Default: 33554432 bytes

  Using a CPU List
     You can manually select CPUs to use for an MPI application by setting the MPI_DSM_CPULIST shell variable. This setting is treated as a comma and/or hyphen delineated ordered list, specifying a mapping of MPI processes to CPUs. If running across multiple hosts, the per host components of the CPU list are delineated by colons. The shepherd process(es) and mpirun are not included in this list.

     This feature will not be compatible with job migration features available in future IRIX releases.

     Examples:

          Value          CPU Assignment

          8,16,32        Place three MPI processes on CPUs 8, 16, and 32.

          32,16,8        Place the MPI process rank zero on CPU 32, rank one on CPU 16, and rank two on CPU 8.

          8-15,32-39     Place the MPI processes 0 through 7 on CPUs 8 to 15. Place the MPI processes 8 through 15 on CPUs 32 to 39.

          39-32,8-15     Place the MPI processes 0 through 7 on CPUs 39 to 32. Place the MPI processes 8 through 15 on CPUs 8 to 15.

          8-15:16-23     Place the MPI processes 0 through 7 on the first host on CPUs 8 through 15. Place MPI processes 8 through 15 on CPUs 16 to 23 on the second host.

     Note that the process rank is the MPI_COMM_WORLD rank. CPUs are associated with the cpunum values given in the hardware graph (hwgraph(4)).

     The number of processors specified must equal the number of MPI processes (excluding the shepherd process) that will be used. The number of colon delineated parts of the list must equal the number of hosts used for the MPI job. If an error occurs in processing the CPU list, the default placement policy is used.

     This feature should not be used with MPI jobs running in spawn capable mode.

  Using MPI with OpenMP
     Hybrid MPI/OpenMP applications might require special memory placement features to operate efficiently on cc-NUMA Origin servers.
     A preliminary method for realizing this memory placement is available. The basic idea is to space out the MPI processes to accommodate the OpenMP threads associated with each MPI process. In addition, assuming a particular ordering of library init code (see the DSO(5) man page), procedures are employed to ensure that the OpenMP threads remain close to the parent MPI process. This type of placement has been found to improve the performance of some hybrid applications significantly when more than four OpenMP threads are used by each MPI process.

     To take partial advantage of this placement option, the following requirements must be met:

     *  The user must set the MPI_OPENMP_INTEROP shell variable when running the application.

     *  The user must use a MIPSpro compiler and the -mp option to compile the application. This placement option is not available with other compilers.

     *  The user must run the application on an Origin 300 or Origin 3000 series server.

     To take full advantage of this placement option, the user must be able to link the application such that the libmpi.so init code is run before the libmp.so init code. This is done by linking the MPI/OpenMP application as follows:

          cc -64 -mp compute_mp.c -lmp -lmpi
          f77 -64 -mp compute_mp.f -lmp -lmpi
          f90 -64 -mp compute_mp.f -lmp -lmpi
          CC -64 -mp compute_mp.C -lmp -lmpi++ -lmpi

     This linkage order ensures that the libmpi.so init runs procedures for restricting the placement of OpenMP threads before the libmp.so init is run. Note that this is not the default linkage if only the -mp option is specified on the link line.

     You can use an additional memory placement feature for hybrid MPI/OpenMP applications by using the MPI_DSM_PLACEMENT shell variable. Specification of a threadroundrobin policy results in the parent MPI process stack, data, and heap memory segments being spread across the nodes on which the child OpenMP threads are running. For more information, see the ENVIRONMENT VARIABLES section of this man page.

     MPI reserves nodes for this hybrid placement model based on the number of MPI processes and the number of OpenMP threads per process, rounded up to the nearest multiple of 4. For instance, if 6 OpenMP threads per MPI process are going to be used for a 4 MPI process job, MPI will request a placement for 32 (4 X 8) CPUs on the host machine. You should take this into account when requesting resources in a batch environment or when using cpusets.
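     For illustration, a hybrid source file of the kind linked in the preceding examples (a hypothetical compute_mp.c; the parallel region body is a placeholder) might be structured as follows:

          #include <stdio.h>
          #include <omp.h>
          #include <mpi.h>

          int main(int argc, char *argv[])
          {
              int rank, nthreads;

              MPI_Init(&argc, &argv);
              MPI_Comm_rank(MPI_COMM_WORLD, &rank);

              /* Number of OpenMP threads this MPI process will use
                 (normally taken from OMP_NUM_THREADS). */
              nthreads = omp_get_max_threads();

              #pragma omp parallel
              {
                  int tid = omp_get_thread_num();
                  /* ... threaded computation local to this MPI process ... */
                  printf("MPI rank %d, OpenMP thread %d of %d\n", rank, tid, nthreads);
              }

              /* MPI communication is typically performed outside the
                 parallel region. */
              MPI_Finalize();
              return 0;
          }

     Such a file would be compiled and linked with one of the command lines shown above.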
     In this implementation, it is assumed that all MPI processes start with the same number of OpenMP threads, as specified by the OMP_NUM_THREADS or equivalent shell variable at job startup.

     NOTE: This placement is not recommended when setting _DSM_PPM to a non-default value (for more information, see pe_environ(5)). This placement is also not recommended when running on a host with partially populated nodes. Also, if you are using MPI_DSM_MUSTRUN, it is important to also set _DSM_MUSTRUN to properly schedule the OpenMP threads.

SEE ALSO
     mpirun(1), shmem_intro(1)

     arrayd(1M)

     MPI_Buffer_attach(3), MPI_Buffer_detach(3), MPI_Init(3), MPI_IO(3)

     arrayd.conf(4)

     array_services(5)

     For more information about using MPI, including optimization, see the Message Passing Toolkit: MPI Programmer's Manual. You can access this manual online at http://techpubs.sgi.com.

     Man pages exist for every MPI subroutine and function, as well as for the mpirun(1) command. Additional online information is available at http://www.mcs.anl.gov/mpi, including a hypertext version of the standard, information on other libraries that use MPI, and pointers to other MPI resources.